Data Repository Organization and Recuperation Process for Multilingual Lexical Databases

Authors

  • Jérôme Godard
  • Mathieu Mangeot
  • Frédéric Andrès
Abstract

This paper describes data management in multilingual lexical databases. Because NLP systems depend on lexical data, building them requires a huge amount of work. It is therefore important to use rigorous, powerful systems and to obtain data at minimal cost. As an open source project, Papillon reuses existing lexical data and aims to get volunteers to collaborate. To make this possible, it combines state-of-the-art concepts from linguistics and computing.

Similar resources

Multilingual Lexical Network from the Archives of the Digital Silk Road

We describe the construction process of a specialized multilingual lexical resource dedicated to the archive of the Digital Silk Road (DSR). The DSR project creates digital archives of cultural heritage along the historical Silk Road; more than 116 basic references on the Silk Road have been digitized and made available online. These books are written in various languages and attract people...

Building Specialized Multilingual Lexical Graphs Using Community Resources

We describe methods for compiling domain-dedicated multilingual terminological data from various resources. We focus on collecting data from online community users as a main source; our approach therefore depends on acquiring contributions from volunteers (explicit approach) and on analyzing users’ behaviors to extract interesting patterns and facts (implicit approach). As a ...

From Resources to Applications. Designing the Multilingual ISLE Lexical Entry

The ISLE Computational Lexicon Working Group is committed to the consensual definition of a standardized infrastructure to develop multilingual resources for HLT applications. In particular, the ISLE-CLWG pursues this goal by designing MILE (Multilingual ISLE Lexical Entry), a general schema for the encoding of multilingual lexical information. This is intended as a meta-entry, acting as...

The Habanera Lexical Knowledge Base Management System

Habanera is a multipurpose multilingual lexical knowledge base that is developed at CRL to be used as a central repository of multilingual lexical data. The knowledge base contains a set of dictionaries and relations between entries, within a dictionary (e.g., synonymy) as well as between entries of different dictionaries (e.g., translation). The format of monolingual lexical entries is left re...

Interchanging Lexical Information for a Multilingual Dictionary

OBJECTIVE: To facilitate the interchange of lexical information for multiple languages in the medical domain, and to pave the way for the emergence of a generally available, truly multilingual electronic dictionary in the medical domain. METHODS: An interchange format has to be neutral relative to the target languages. It has to be consistent with current needs of lexicon authors, present and future...

Publication date: 2002